title: "cm007 Exercises: Exploring Aesthetic Mappings"
output: github_document
library(gapminder)
library(ggplot2)
Switch focus to exploring aesthetic mappings, instead of geoms.
Scatterplot of gdpPercap vs lifeExp with a categorical variable (continent) as shape.
gvsl <- ggplot(gapminder, aes(gdpPercap, lifeExp)) + scale_x_log10()
gvsl + geom_point(aes(shape = continent), alpha = 0.2)
Shapes can also be set outside aes(), as an integer code or a keyboard character. What about pch?
gvsl + geom_point(shape = 7)
gvsl + geom_point(pch = 7)
gvsl + geom_point(shape = "$")
A list of shapes can be found at the bottom of the scale_shape documentation (?scale_shape).
Make a scatterplot. Then:
gvsl + geom_point(aes(colour = continent))
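Try a continuous variable (pop) as colour. ggplot2 accepts both spellings, colour and color.
Use trans="log10" for a log scale (newer ggplot2 releases also accept transform="log10").
gvsl + geom_point(aes(colour = pop)) + scale_colour_continuous(trans = "log10")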
gvsl + geom_point(aes(colour = lifeExp > 60))
Make a line plot of gdpPercap over time for all countries. Colour by lifeExp > 60 (remember that lifeExp looks bimodal?)
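One possible version of this line plot, offered as a sketch rather than the intended answer. The log scale on the y-axis and the alpha value are assumptions added for readability; the prompt does not ask for them.

```r
library(gapminder)
library(ggplot2)

# One line per country (group = country); colour switches along each
# line depending on whether lifeExp is above 60 in that year.
gdp_lines <- ggplot(gapminder, aes(year, gdpPercap, group = country)) +
  geom_line(aes(colour = lifeExp > 60), alpha = 0.5) +
  scale_y_log10()  # assumption: log scale, since gdpPercap is heavily skewed

gdp_lines
```

Without group = country, geom_line() would connect all observations within each colour group into one zig-zag line.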
Try adding colour to a histogram. How is this different?
ggplot(gapminder, aes(lifeExp)) + geom_histogram(aes(fill = continent))
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
Make histograms of gdpPercap for each continent. Try the scales and ncol arguments.
ggplot(gapminder, aes(gdpPercap)) +
  facet_wrap(~ continent, scales = "free_x") +
  geom_histogram()
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
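The prompt also mentions ncol, which the code above does not set. A small sketch of that variation follows; the choice of two columns and 30 bins is purely illustrative.

```r
library(gapminder)
library(ggplot2)

# Histograms of gdpPercap per continent, laid out in two columns.
# scales = "free_x" lets each panel use its own x-axis range.
hist_by_continent <- ggplot(gapminder, aes(gdpPercap)) +
  facet_wrap(~ continent, scales = "free_x", ncol = 2) +
  geom_histogram(bins = 30)  # setting bins explicitly silences the stat_bin() message

hist_by_continent
```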
Remove Oceania. Add another variable: lifeExp > 60.
# Drop Oceania before plotting, as the prompt asks
ggplot(subset(gapminder, continent != "Oceania"), aes(gdpPercap)) +
  facet_grid(continent ~ lifeExp > 60) +
  geom_histogram()
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
Add a size aesthetic to a scatterplot. What about cex?
gvsl + geom_point(aes(size = pop), alpha = 0.2) +
  scale_size_area() # bubble area is proportional to the population
Compare scale_radius() and scale_size_area(). What’s better?
Use shape=21 to distinguish between fill (interior) and colour (exterior).
gvsl + geom_point(aes(size = pop, fill = continent), shape = 21, color = "black", alpha = 0.2)
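To compare the two size scales side by side, something like the following could be used. The object names p_size, p_radius and p_area are made up for this sketch.

```r
library(gapminder)
library(ggplot2)

gvsl <- ggplot(gapminder, aes(gdpPercap, lifeExp)) + scale_x_log10()
p_size <- gvsl + geom_point(aes(size = pop), alpha = 0.2)

# scale_radius(): the point radius is scaled with pop
p_radius <- p_size + scale_radius()

# scale_size_area(): the point area is scaled with pop, and a value of 0 maps to size 0
p_area <- p_size + scale_size_area()

p_radius
p_area
```

Because readers tend to judge bubbles by area, scaling the radius usually exaggerates differences between large and small populations.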
Let’s try plotting much of the data.
gvsl + geom_point(aes(size = pop, color = continent)) +
  scale_size_area() + # area proportional to pop.
  facet_wrap(~ year)
(x and y aesthetics)
Let’s see how Rwanda’s life expectancy and GDP per capita have evolved over time, using a path plot.
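A minimal sketch of such a path plot, assuming base subset() for the filtering step. The label offset and text size are arbitrary choices.

```r
library(gapminder)
library(ggplot2)

# Keep only Rwanda's rows (one per survey year)
rwanda <- subset(gapminder, country == "Rwanda")

# geom_path() connects observations in data order (here, by year),
# unlike geom_line(), which connects them in order of x.
rwanda_path <- ggplot(rwanda, aes(gdpPercap, lifeExp)) +
  geom_path(arrow = arrow()) +
  geom_point() +
  geom_text(aes(label = year), vjust = -0.8, size = 3)

rwanda_path
```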
Try geom_line(). Try geom_point().
Add the arrow=arrow() option.
Add geom_text, with year label.
Try cyl (number of cylinders) ~ am (transmission) in the mtcars data frame.
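For the mtcars prompt, one way to set up the base plot and the layers worth trying is sketched here. Converting cyl and am to factors is an assumption made so that both axes are treated as categorical.

```r
library(ggplot2)

# Both variables are stored as numbers in mtcars; treat them as categories.
cyl_am <- ggplot(mtcars, aes(factor(cyl), factor(am)))

p_count <- cyl_am + geom_count()   # point size shows the number of cars
p_bin2d <- cyl_am + geom_bin2d()   # fill shows the number of cars

# geom_tile() does no counting itself, so tabulate first and map fill by hand.
tab <- as.data.frame(table(cyl = mtcars$cyl, am = mtcars$am))
p_tile <- ggplot(tab, aes(cyl, am, fill = Freq)) + geom_tile()
```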
Try geom_count().
Try geom_bin2d(). Compare with geom_tile() with fill aes.
Try a scatterplot with:
geom_hex()
geom_density2d()
geom_smooth()
How many countries are in each continent? Use the year 2007.
After filtering the gapminder data to 2007, make a bar chart of the number of countries in each continent. Store everything except the geom in the variable d.
Notice the y-axis. Oddly, ggplot2 doesn’t make it obvious how to change to proportion. Try adding a y aesthetic: y=..count../sum(..count..).
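The two steps above might look like this. The sketch uses after_stat(), which replaces the ..count.. notation in ggplot2 3.3.0 and later; gap2007, bar_counts and bar_props are names made up here.

```r
library(gapminder)
library(ggplot2)

# One row per country once a single year is selected
gap2007 <- subset(gapminder, year == 2007)

# Everything except the geom
d <- ggplot(gap2007, aes(continent))

# Counts on the y-axis (the default for geom_bar)
bar_counts <- d + geom_bar()

# Proportions on the y-axis: each bar's count divided by the total count
bar_props <- d + geom_bar(aes(y = after_stat(count / sum(count))))

bar_counts
bar_props
```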
Uses of bar plots: Get a sense of relative quantities of categories, or see the probability mass function of a categorical random variable.
Add coord_polar() to a scatterplot.
If you’d like some practice, give these exercises a try:
Exercise 1: Make a plot of year (x) vs lifeExp (y), with points coloured by continent. Then, to that same plot, fit a straight regression line to each continent, without the error bars. If you can, try piping the data frame into the ggplot function.
Exercise 2: Repeat Exercise 1, but switch the regression line and geom_point layers. How is this plot different from that of Exercise 1?
Exercise 3: Omit the geom_point layer from either of the above two plots (it doesn’t matter which). Does the line still show up, even though the data aren’t shown? Why or why not?
Exercise 4: Make a plot of year (x) vs lifeExp (y), facetted by continent. Then, fit a smoother through the data for each continent, without the error bars. Choose a span that you feel is appropriate.
Exercise 5: Plot the population over time (year) using lines, so that each country has its own line. Colour by gdpPercap. Add alpha transparency to your liking.
Exercise 6: Add points to the plot in Exercise 5.
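As possible starting points for Exercises 1, 4, 5 and 6, the sketches below show one way to assemble the layers. They are not the only valid answers, and the alpha values, the span and the log scale are placeholders to adjust.

```r
library(gapminder)
library(ggplot2)

# Exercise 1: points coloured by continent, plus one straight line per continent.
# se = FALSE removes the error bands.
ex1 <- ggplot(gapminder, aes(year, lifeExp, colour = continent)) +
  geom_point(alpha = 0.3) +
  geom_smooth(method = "lm", se = FALSE)

# Exercise 4: one panel per continent with a loess smoother.
# span = 0.9 is an arbitrary placeholder; smaller values give wigglier curves.
ex4 <- ggplot(gapminder, aes(year, lifeExp)) +
  facet_wrap(~ continent) +
  geom_point(alpha = 0.2) +
  geom_smooth(method = "loess", span = 0.9, se = FALSE)

# Exercise 5: one line per country, coloured by gdpPercap.
ex5 <- ggplot(gapminder, aes(year, pop, group = country, colour = gdpPercap)) +
  geom_line(alpha = 0.4) +
  scale_y_log10()  # assumption: log scale so small countries stay visible

# Exercise 6: the same plot with points added.
ex6 <- ex5 + geom_point(alpha = 0.4)
```

Swapping the order of geom_point() and geom_smooth() in ex1 gives the plot asked for in Exercise 2, since layers are drawn in the order they are added.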